  • Tuesday, September 3, 2024

    Failures and partial failures are frequent and inherent in distributed systems, so these systems demand more robust designs and come with higher costs. Coordinating them carefully relies on data locality and partial availability.

  • Wednesday, June 5, 2024

    Powerful modern computers do not eliminate the need for distributed systems. While many workloads can fit on a single powerful machine, distributed systems still offer advantages like improved availability, durability, and isolation. Single-machine systems may seem simpler, but coordination issues start to arise in larger organizations.

  • Tuesday, August 13, 2024

    Distributed systems are naturally relational. Function invocations in distributed systems can be implemented as triggers on relational tables, where data updates trigger function executions. This model allows for efficient parallel processing, eliminates the need for manual coordination between systems, and makes writing code easier.
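
    As a rough illustration of the "updates trigger function executions" idea, here is a minimal single-process sketch using Python's built-in sqlite3 module. The table names (`orders`, `customer_totals`) and the trigger `on_order` are hypothetical and not taken from the article; the trigger body stands in for the invoked function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total INTEGER);

-- The trigger body plays the role of the "function": it runs for every row
-- inserted into orders, with no caller coordinating the invocation.
CREATE TRIGGER on_order AFTER INSERT ON orders
BEGIN
    INSERT OR IGNORE INTO customer_totals VALUES (NEW.customer, 0);
    UPDATE customer_totals SET total = total + NEW.amount
    WHERE customer = NEW.customer;
END;
""")

# Writers only insert data; the derived table is maintained by the trigger.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 30), ("bob", 5), ("ada", 12)],
)
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
```

    A real distributed version would need to spread this across machines; the sketch only shows the programming model, where the data change itself is the invocation.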

  • Wednesday, August 28, 2024

    There are significant changes happening with distributed systems. These changes will influence how systems are operated and how they are programmed. This article shares insights into the changes in transactional and analytical systems, especially around object storage and programming models. There are many possible disruptive technologies, so it is challenging to pick the winners and losers.

  • Friday, September 27, 2024

    In the realm of software engineering, certain principles emerge from experience, often learned the hard way. A recent article highlights four key software design principles that can significantly impact the development process and the reliability of software systems.

    The first principle emphasizes the importance of maintaining a single source of truth. When data is stored in multiple locations, the risk of inconsistencies increases. For instance, in a frontend application displaying a bank balance, it is advisable to retrieve the balance directly from the server rather than storing it in multiple places. This approach minimizes synchronization issues and ensures that derived values, like a spendable balance, are calculated on-the-fly rather than stored separately. The overarching message is that derived data should be computed rather than duplicated to avoid potential bugs.

    The second principle challenges the conventional wisdom of "Don't Repeat Yourself" (DRY) by introducing the concept of "Please Repeat Yourself" (PRY). The author argues that striving for excessive reusability can lead to overly complex abstractions that lose their original purpose. Instead of forcing code into a single reusable class, it may be more effective to allow for some code duplication, which can simplify testing and maintenance. This principle acknowledges that while code reuse is valuable, it should not come at the cost of clarity and functionality.

    The third principle addresses the use of mocks in testing. While mocks can facilitate quick unit tests, they can also lead to issues when the mocked components do not accurately reflect the real dependencies. The author suggests that relying too heavily on mocks can compromise the reliability of tests, as they may not behave as expected in production. Instead, it is recommended to use real dependencies whenever possible, even if it means writing more comprehensive tests. This approach enhances the reliability of the software and reduces the likelihood of encountering issues in production.

    The final principle focuses on minimizing mutable state. The author argues that while caching and state management are essential in software development, it is crucial to evaluate what data truly needs to be stored versus what can be derived dynamically. By reducing mutable state, developers can avoid synchronization problems and streamline the development process. The principle advocates for a more straightforward approach, allowing for redundant calculations when necessary, as modern computing power can handle such tasks efficiently.

    These principles serve as valuable guidelines for software engineers, encouraging them to think critically about their design choices and the implications of those choices on the overall reliability and maintainability of their systems. Each principle highlights the importance of simplicity, clarity, and a thoughtful approach to software design, ultimately leading to more robust and effective software solutions.
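
    The bank-balance example can be sketched as follows. This is only an illustration of the "derive, don't store" idea; the `Account` class and its field names are hypothetical and do not come from the article.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Stored state: the values that would come straight from the server.
    balance_cents: int
    pending_holds_cents: int

    @property
    def spendable_cents(self) -> int:
        # Derived on every access instead of being stored in a second field,
        # so it cannot drift out of sync with the values it depends on.
        return self.balance_cents - self.pending_holds_cents


account = Account(balance_cents=10_000, pending_holds_cents=2_500)
account.pending_holds_cents = 4_000  # only the source data is ever updated
```

    Because `spendable_cents` is recomputed each time, updating the holds needs no extra bookkeeping step that could be forgotten.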

  • Monday, August 12, 2024

    Data infrastructure projects are often quickly replaced and difficult to maintain. To prevent this, it's important to avoid "resume-driven development," where teams prioritize trendy technologies over practical needs, and the "key person dependency" problem, where only one person has all the knowledge of a system.

  • Wednesday, April 24, 2024

    For resilient payment systems, Big Tech uses idempotency keys to prevent duplicate transactions and sets short timeouts to give users quick feedback. Circuit breakers, like those in the stock market, prevent cascading failures. These companies monitor the "four golden signals" (latency, traffic, errors, and saturation) to find and fix issues before they affect users.
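
    A minimal in-memory sketch of two of these techniques, idempotency keys and a circuit breaker, might look like the following. The class names, thresholds, and result shape are assumptions made for illustration; a production system would persist idempotency keys durably and tune the limits.

```python
import time


class PaymentService:
    """Hypothetical service that deduplicates charges by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> result of the first attempt
        self.charges = 0

    def charge(self, idempotency_key, amount_cents):
        # A retried request with the same key returns the stored result
        # instead of charging the customer a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges += 1
        result = {"charge_id": self.charges, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


class CircuitBreaker:
    """Stops calling a failing dependency until a cool-down has passed."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip (or re-trip)
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

    For example, `CircuitBreaker().call(service.charge, "order-17", 500)` would route a charge through the breaker, and repeating the call with the same key would not create a second charge.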

  • Monday, July 8, 2024

    Linus Torvalds emphasizes the importance of data structures over code in software development, since good data structures lead to better code design and maintainability. The author supports this view with personal experience, describing how restructuring data in a project allowed the team to move faster in the long run. This prioritization is also how Git grew to be the dominant version control system.

  • Tuesday, April 16, 2024

    Data engineers can avoid burnout and build effective data platforms by aligning the infrastructure with business needs, automating tasks, and prioritizing reliability. They should monitor infrastructure proactively and plan for failures ahead of time.

  • Tuesday, April 23, 2024

    This author built a large-scale service and found certain principles reappearing throughout the implementation. It's useful to prioritize a single source of truth and minimize mutable state when building something from scratch. Developers should also make sure not to abstract things prematurely and not to overuse mocks when writing tests for their code.

  • Tuesday, April 9, 2024

    Distributed SQLite databases sacrifice consistency, transactions, and scalability. Traditional databases like PostgreSQL, paired with effective HTTP caching for speed, are better choices than using distributed SQLite. The upside to SQLite databases is that they are really fast, but at some point, the maintenance overhead outweighs the speed benefits.

  • Friday, May 3, 2024

    Cache coherence makes sure that data stays consistent across multiple caches in distributed systems. There are two types of cache coherence protocols: snooping and directory-based protocols. In snooping protocols, caches "listen" on a bus, updating or invalidating copies based on other caches' actions. In directory-based protocols, a central directory tracks data location and state, coordinating updates and invalidations.
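
    The directory-based approach can be sketched as a toy model. This is a simplified write-through, invalidate-on-write scheme with hypothetical `Directory` and `Cache` classes; real protocols track per-line states (for example MESI) and are considerably more involved.

```python
class Directory:
    """Central directory: tracks which caches hold a copy of each address."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.sharers = {}  # address -> set of caches holding a copy

    def read(self, cache, addr):
        # Record the reader as a sharer, then serve the current value.
        self.sharers.setdefault(addr, set()).add(cache)
        return self.memory[addr]

    def write(self, cache, addr, value):
        # Invalidate every other copy before the write becomes visible.
        for other in self.sharers.get(addr, set()) - {cache}:
            other.invalidate(addr)
        self.sharers[addr] = {cache}
        self.memory[addr] = value


class Cache:
    def __init__(self, directory):
        self.directory = directory
        self.lines = {}  # address -> locally cached value

    def read(self, addr):
        if addr not in self.lines:  # miss: ask the directory
            self.lines[addr] = self.directory.read(self, addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.directory.write(self, addr, value)  # write-through
        self.lines[addr] = value

    def invalidate(self, addr):
        self.lines.pop(addr, None)
```

    A snooping protocol would replace the directory with a shared bus on which every cache observes every write and invalidates or updates its own copy.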

  • Friday, March 29, 2024

    This author switched a side project to a Kubernetes-based infrastructure, only to find it overly complex, expensive, and difficult to manage. Despite the promise of high availability, the system suffered from slow performance, difficult debugging, and downtime during node failures. While Kubernetes can be powerful, it's important to choose the right tools for the job and not take on complexity that isn't necessary.

  • Tuesday, June 25, 2024

    Local-first software stores and processes data locally on users' devices while using the internet for backup when connected. Resilient Sync uses a simple log format to track changes and assets, allowing offline data processing and easy sync across devices. It offers independence from push notifications, the ability to load entries without knowing their content, easy detection of missing data, and the option for data replication and peer-to-peer synchronization.

  • Wednesday, August 7, 2024

    The phrase "just implementation details" often underestimates the complexity and difficulty involved in building and deploying software. Building good software involves challenges like designing a maintainable system, achieving robustness and observability, and providing a good user experience. The perception that "CRUD" applications are simple is mistaken, since they also require careful database design, production support, and handling of background jobs, user logins, and permissions.

  • Wednesday, April 17, 2024

    In programming languages, failures are systemic limitations that come from constraints and might be recoverable. Mistakes are code-based errors that violate program logic and usually need safe termination. Failures and mistakes should be handled differently by software.
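
    One way to express this distinction in Python, offered as a sketch rather than the article's own scheme, is to model failures as catchable exceptions and mistakes as assertions that are not meant to be handled. The names `ConfigUnavailable`, `load_config`, `load_with_fallback`, and `percent` are hypothetical.

```python
import json


class ConfigUnavailable(Exception):
    """A failure: the environment could not provide the file. Recoverable."""


def load_config(path):
    try:
        with open(path) as f:
            return json.load(f)
    except OSError as exc:
        # Failure caused by an external constraint (missing file, permissions).
        raise ConfigUnavailable(path) from exc


def load_with_fallback(path, default):
    # Failures are expected at runtime, so callers recover from them.
    try:
        return load_config(path)
    except ConfigUnavailable:
        return default


def percent(part, whole):
    # A non-positive `whole` is a mistake in the calling code. It is not
    # caught anywhere; the program should stop so the bug gets fixed.
    assert whole > 0, "whole must be positive"
    return 100 * part / whole
```

    Here a missing file leads to a fallback value, while a contract violation in `percent` surfaces as an `AssertionError` that normally terminates the program.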